Partitioning Similarity Graphs: A Framework for Declustering Problems
Abstract
Declustering problems are well known in databases for parallel computing environments. In this paper, we propose a new similarity-based technique for declustering data. The proposed method can adapt to the available information about query distribution (e.g. size, shape and frequency) and can work with alternative atomic data-types. Furthermore, the proposed method is flexible and can work with alternative data distributions, data sizes and partition-size constraints. The method is based on max-cut partitioning of a similarity graph defined over the given set of data, under constraints on the partition sizes. It maximizes the chances that a pair of atomic data-items that are frequently accessed together by queries are allocated to distinct disks. We describe the application of the proposed method to parallelizing Grid Files at the data page level. Detailed experiments in this context show that the proposed method adapts to query distribution and data distribution, and that it outperforms traditional mapping-function-based methods for many interesting query distributions as well as for several non-uniform data distributions.
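The idea of size-constrained max-cut partitioning of a similarity graph can be illustrated with a small sketch. This is not the authors' algorithm: it is a simple greedy heuristic, and the function names (`build_similarity`, `greedy_decluster`, `cut_weight`), the edge-weight definition (summed frequency of queries that touch both items), and the uniform per-disk capacity are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_similarity(queries):
    """Build edge weights of a similarity graph (assumed definition).

    `queries` is a list of (items, frequency) pairs. The weight of edge
    (a, b) is the total frequency of queries that access both a and b.
    """
    weights = defaultdict(float)
    for items, freq in queries:
        for a, b in combinations(sorted(set(items)), 2):
            weights[(a, b)] += freq
    return dict(weights)


def greedy_decluster(items, weights, num_disks, capacity):
    """Greedy heuristic for max-cut under a per-disk size constraint.

    Each item goes to the non-full disk on which it has the least total
    similarity to items already placed there, so heavy edges tend to be cut.
    """
    if num_disks * capacity < len(items):
        raise ValueError("not enough total capacity for all items")

    # Symmetric adjacency view of the weighted graph.
    adj = defaultdict(dict)
    for (a, b), w in weights.items():
        adj[a][b] = w
        adj[b][a] = w

    # Place strongly connected items first; break ties by item name.
    order = sorted(items, key=lambda x: (-sum(adj[x].values()), x))

    assignment = {}
    load = [0] * num_disks
    for item in order:
        best_key, best_disk = None, None
        for disk in range(num_disks):
            if load[disk] >= capacity:
                continue  # partition-size constraint
            # Similarity that would stay uncut if `item` joined this disk.
            cost = sum(w for nb, w in adj[item].items()
                       if assignment.get(nb) == disk)
            key = (cost, load[disk], disk)
            if best_key is None or key < best_key:
                best_key, best_disk = key, disk
        assignment[item] = best_disk
        load[best_disk] += 1
    return assignment


def cut_weight(weights, assignment):
    """Total weight of edges whose endpoints lie on different disks."""
    return sum(w for (a, b), w in weights.items()
               if assignment[a] != assignment[b])


if __name__ == "__main__":
    # Hypothetical workload: four data pages, four weighted queries.
    queries = [(["p0", "p1"], 3.0), (["p1", "p2"], 2.0),
               (["p2", "p3"], 1.0), (["p0", "p1", "p2"], 1.0)]
    w = build_similarity(queries)
    placement = greedy_decluster(["p0", "p1", "p2", "p3"], w,
                                 num_disks=2, capacity=2)
    print(placement, cut_weight(w, placement))
```

A greedy pass like this gives no optimality guarantee. An iterative-improvement step (for example, swapping pairs of items between disks while the cut weight increases) is one common way to refine such a starting assignment.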
Similar resources
Efficient Disk Allocation for Fast Similarity Searching
As databases increasingly integrate non-textual information it is becoming necessary to support efficient similarity searching in addition to range searching. Recently, declustering techniques have been proposed for improving the performance of similarity searches through parallel I/O. In this paper, we propose a new scheme which provides good declustering for similarity searching. In particular...
A Similarity Graph-Based Approach to Declustering Problems and Its Application towards Paralleling Grid Files
We propose a new similarity-based technique for declustering data. The proposed method can adapt to available information about query distributions, data distributions, data sizes and partition-size constraints. The method is based on max-cut partitioning of a similarity graph defined over the given set of data, under constraints on the partition sizes. It maximizes the chances that a pair of da...
Iterative-improvement-based declustering heuristics for multi-disk databases
Data declustering is an important issue for reducing query response times in multi-disk database systems. In this paper, we propose a declustering method that utilizes the available information on query distribution, data distribution, data-item sizes, and disk capacity constraints. The proposed method exploits the natural correspondence between a data set with a given query distribution and a ...
Parallelizing Spatial Databases on Shared-Memory Multiprocessors
Several emerging visualization applications such as flight simulators, distributed interactive simulation (DIS), and virtual reality are using geographic information systems (GISs) for high-fidelity representation of actual terrains. These applications impose stringent performance and response-time restrictions which require parallelization of the GIS and shared-memory multiprocessors (SMPs) are we...
Design and Evaluation of a Method for Partitioning and Offloading Web-based Applications in Mobile Systems with Bandwidth Constraints
Computation offloading is known to be among the effective solutions of running heavy applications on smart mobile devices. However, irregular changes of a mobile data rate have direct impacts on code partitioning when offloading is in progress. It is believed that once a rate-adaptive partitioning performed, the replication of such substantial processes due to bandwidth fluctuation can be avoid...
Journal title:
- Inf. Syst.
Volume 21
Publication date 1996